novel action
Safely Exploring Novel Actions in Recommender Systems via Deployment-Efficient Policy Learning
Kiyohara, Haruka, Narita, Yusuke, Saito, Yuta, Tateno, Kei, Udagawa, Takuma
In many real recommender systems, novel items are added frequently over time. The importance of sufficiently presenting novel actions has been widely acknowledged for improving long-term user engagement. Recent work builds on Off-Policy Learning (OPL), which trains a policy from logged data only; however, existing methods can be unsafe in the presence of novel actions. Our goal is to develop a framework that enforces exploration of novel actions with a safety guarantee. To this end, we first develop Safe Off-Policy Policy Gradient (Safe OPG), a model-free safe OPL method based on high-confidence off-policy evaluation. In our first experiment, we observe that Safe OPG almost always satisfies a safety requirement, even when existing methods violate it greatly. However, the result also reveals that Safe OPG tends to be too conservative, suggesting a difficult tradeoff between guaranteeing safety and exploring novel actions. To overcome this tradeoff, we also propose a novel framework called Deployment-Efficient Policy Learning for Safe User Exploration, which leverages the safety margin and gradually relaxes safety regularization over multiple (but not many) deployments. Our framework thus enables exploration of novel actions while guaranteeing safe implementation of recommender systems.
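The idea of a safety check based on high-confidence off-policy evaluation can be illustrated with a minimal sketch. It is not the Safe OPG method itself: the abstract does not say which estimator or confidence bound is used, so the clipped inverse-propensity-scoring (IPS) estimate and the Hoeffding bound below are assumptions, as are the names `ips_lower_bound`, `is_safe`, `weight_cap`, and `baseline_value`. The sketch also assumes rewards in [0, 1] and a logging probability greater than zero for every logged action.

```python
import math


def ips_lower_bound(rewards, target_probs, logging_probs, delta=0.05, weight_cap=10.0):
    """Lower confidence bound on a policy's value from logged data.

    Assumptions (not from the source): rewards lie in [0, 1], importance
    weights are clipped at weight_cap, and Hoeffding's inequality is used.
    The bound holds with probability at least 1 - delta.
    """
    n = len(rewards)
    terms = []
    for r, p_new, p_log in zip(rewards, target_probs, logging_probs):
        weight = min(p_new / p_log, weight_cap)  # clipped importance weight
        terms.append(weight * r)                 # each term lies in [0, weight_cap]
    mean = sum(terms) / n
    # Hoeffding half-width for n i.i.d. terms bounded in [0, weight_cap].
    half_width = weight_cap * math.sqrt(math.log(1.0 / delta) / (2.0 * n))
    return mean - half_width


def is_safe(rewards, target_probs, logging_probs, baseline_value, delta=0.05, weight_cap=10.0):
    """Deem the new policy safe only if its lower bound reaches the baseline value."""
    bound = ips_lower_bound(rewards, target_probs, logging_probs, delta, weight_cap)
    return bound >= baseline_value
```

With few logged samples the half-width is large, so the check rejects most candidate policies. This is one way to see why a method built on such a bound can be conservative.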
Open Set Action Recognition via Multi-Label Evidential Learning
Zhao, Chen, Du, Dawei, Hoogs, Anthony, Funk, Christopher
Existing methods for open-set action recognition focus on novelty detection that assumes video clips show a single action, which is unrealistic in the real world. We propose a new method for open-set action recognition and novelty detection via MUlti-Label Evidential learning (MULE), which goes beyond previous novel action detection methods by addressing the more general problem of single or multiple actors in the same scene, with simultaneous action(s) by any actor. Our Beta Evidential Neural Network estimates multi-action uncertainty with Beta densities based on actor-context-object relation representations. An evidence debiasing constraint is added to the objective function to reduce the static bias of video representations, which can incorrectly correlate predictions with static cues. We develop a learning algorithm based on a primal-dual average scheme update to solve the resulting optimization problem. Theoretical analysis of the optimization algorithm demonstrates the convergence of the primal solution sequence and gives bounds for both the loss function and the debiasing constraint. Uncertainty-based and belief-based novelty estimation mechanisms are formulated to detect novel actions. Extensive experiments on two real-world video datasets show that our proposed approach achieves promising performance in single/multi-actor, single/multi-action settings.
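The link between Beta densities and per-action uncertainty can be illustrated with a short sketch. It assumes the standard subjective-logic mapping, in which positive and negative evidence for one action define a Beta(alpha, beta) density with alpha = positive evidence + 1 and beta = negative evidence + 1; the paper's exact formulation may differ. The names `beta_opinion` and `novelty_score`, and the rule of scoring novelty as one minus the largest belief, are hypothetical and used only for illustration.

```python
def beta_opinion(pos_evidence, neg_evidence):
    """Map non-negative evidence for one action to (belief, disbelief, uncertainty).

    Assumed mapping (standard subjective logic, not confirmed by the source):
    alpha = pos_evidence + 1, beta = neg_evidence + 1, and the three returned
    masses sum to 1.
    """
    alpha = pos_evidence + 1.0
    beta = neg_evidence + 1.0
    total = alpha + beta
    belief = pos_evidence / total      # mass supporting "action present"
    disbelief = neg_evidence / total   # mass supporting "action absent"
    uncertainty = 2.0 / total          # mass left unassigned when evidence is scarce
    return belief, disbelief, uncertainty


def novelty_score(evidence_per_action):
    """Hypothetical belief-based score: high when no known action is believed present.

    evidence_per_action is a list of (pos_evidence, neg_evidence) pairs,
    one pair per known action class.
    """
    beliefs = [beta_opinion(pos, neg)[0] for pos, neg in evidence_per_action]
    return 1.0 - max(beliefs)
```

Because each action gets its own Beta density, several actions can have high belief at once, which is what a multi-label setting needs.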
A Dynamic Neural Network Approach to Generating Robot's Novel Actions: A Simulation Experiment
In this study, we investigate how a robot can generate novel and creative actions from its own experience of learning basic actions. Inspired by a machine learning approach to computational creativity, we propose a dynamic neural network model that can learn and generate a robot's actions. We conducted a set of simulation experiments with a humanoid robot. The results showed that the proposed model was able to learn the basic actions and also to generate novel actions by modulating and combining those learned actions. Analysis of the neural activity showed that the ability to generate creative actions emerged from the model's nonlinear memory structure, which self-organized during training. The results also showed that different ways of learning the basic actions induced the self-organization of memory structures with different characteristics, resulting in the generation of different levels of creative actions. Our approach can be utilized in human-robot interaction in which a user can interactively explore the robot's memory to control its behavior and also discover other novel actions. If the robot is only capable of reproducing the behaviors that it has learned, the user might easily lose interest in interacting with the robot. In addition, it is cumbersome for the user to teach the robot every single behavior.
Tool Use Learning in Robots
Brown, Solly (University of New South Wales) | Sammut, Claude (University of New South Wales)
Learning to use an object as a tool requires understanding what goals it helps to achieve, the properties of the tool that make it useful and how the tool must be manipulated to achieve the goal. We present a method that allows a robot to learn about objects in this way and thereby employ them as tools. An initial hypothesis for an action model of tool use is created by observing another agent accomplishing a task using a tool. The robot then refines its hypothesis by active learning, generating new experiments and observing the outcomes. Hypotheses are updated using Inductive Logic Programming. One of the novel aspects of this work is the method used to select experiments so that the search through the hypothesis space is minimised.